Generalizing Matrix Multiplication for Efficient Computations on Modern Computers
Authors
Abstract
Recent advances in computing allow taking a new look at matrix multiplication, where the key ideas are: decreasing interest in recursion, the development of processors with thousands (potentially millions) of processing units, and influences from Algebraic Path Problems. In this context, we propose a generalized matrix-matrix multiply-add (MMA) operation and illustrate its usability. Furthermore, we elaborate on the interrelation between this generalization and the BLAS standard.
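The abstract does not spell out the generalized operation, so the following is only a minimal sketch of one common reading, assuming MMA means C <- C (+) A (x) B with the scalar "add" and "multiply" supplied by the caller, as in Algebraic Path Problem formulations. The function name generalized_mma and its parameters are hypothetical and are not taken from the paper or from BLAS.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")
Matrix = List[List[T]]


def generalized_mma(
    c: Matrix,
    a: Matrix,
    b: Matrix,
    add: Callable[[T, T], T],
    mul: Callable[[T, T], T],
) -> Matrix:
    """Return C (+) A (x) B, where (+) and (x) are caller-supplied scalar operations.

    Hypothetical helper for illustration only; not the paper's definition.
    """
    n, k, m = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must agree"
    assert len(c) == n and all(len(row) == m for row in c), "C must be n x m"
    out = [row[:] for row in c]  # leave the caller's C untouched
    for i in range(n):
        for j in range(m):
            acc = out[i][j]
            for p in range(k):
                acc = add(acc, mul(a[i][p], b[p][j]))
            out[i][j] = acc
    return out


# Ordinary arithmetic (+, *) gives the familiar multiply-add.
plain = generalized_mma(
    [[0, 0], [0, 0]],
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
    lambda x, y: x + y,
    lambda x, y: x * y,
)

# The (min, +) pair turns the same loop nest into one shortest-path relaxation step.
INF = float("inf")
d = [[0, 3, INF], [INF, 0, 1], [2, INF, 0]]
relaxed = generalized_mma(d, d, d, min, lambda x, y: x + y)
```

With (+, *) the call reproduces the usual product of the two 2x2 matrices; with (min, +) the same loops shorten path lengths through one intermediate vertex, which is the kind of reuse the Algebraic Path Problem viewpoint suggests.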
Similar resources
Generalizing of a High Performance Parallel Strassen Implementation on Distributed Memory MIMD Architectures
Strassen’s algorithm to multiply two n×n matrices reduces the asymptotic operation count from O(n^3) for the traditional algorithm to O(n^2.81), so designing an efficient parallelization of this algorithm becomes essential. In this paper, we present our generalization of a parallel Strassen implementation, which obtained very good performance on an Intel Paragon: 20% faster for n ≈ 1000 and more than 100%...
A New Parallel Matrix Multiplication Method Adapted on Fibonacci Hypercube Structure
The objective of this study was to develop a new optimal parallel algorithm for matrix multiplication that can run on a Fibonacci Hypercube structure. Most of the popular algorithms for parallel matrix multiplication cannot run on a Fibonacci Hypercube structure; therefore, a method that can run on all structures, especially the Fibonacci Hypercube structure, is necessary for parallel matr...
Vectorization and Parallelization of Loops in C/C++ Code
Modern computer processors can support parallel execution of a program by using their multicores. Computers can also support vector operations by using their extended SIMD instructions. To make a computer program run faster, the time-consuming loop computations in the program can often be parallelized and vectorized to utilize the capacity of multicores and extended SIMD instructions. In this p...
Fast matrix multiplication techniques based on the Adleman-Lipton model
Abstract. On distributed memory electronic computers, the implementation and association of fast parallel matrix multiplication algorithms has yielded astounding results and insights. In this discourse, we use the tools of molecular biology to demonstrate the theoretical encoding of Strassen’s fast matrix multiplication algorithm with DNA based on an n-moduli set in the residue number system, t...
Developing Tensor Operations with an Underlying Group Structure
Tensor computations frequently involve factoring or decomposing a tensor into a sum of rank-1 tensors (CANDECOMP-PARAFAC, HOSVD, etc.). These decompositions are often considered as different higher-order extensions of the matrix SVD. The HOSVD can be described using the n-mode product, which describes multiplication between a higher-order tensor and a matrix. Generalizing this multiplication le...
Journal title:
Volume, issue:
Pages: -
Publication date: 2011